Systems and methods for diagnosing equipment

ABSTRACT

A method may include recording operation of equipment to create an audio file, extracting features from the audio file, inputting the extracted features into a machine learning model, and determining with the machine learning model a score indicative of the operation of the equipment. A system may include an audio sensor to record audio of operation of equipment and generate an audio file, and one or more processors. The one or more processors extract features from the audio file, input the extracted features into a machine learning model, and determine with the machine learning model a score indicative of the operation of the equipment.

CROSS REFERENCE TO RELATED APPLICATION

This application claims priority to U.S. Provisional Application 63/122,312, filed 7 Dec. 2020, the entire disclosure of which is incorporated herein by reference.

BACKGROUND

Technical Field

The disclosed subject matter described herein relates to systems and methods for diagnosing equipment.

Discussion of Art

Equipment, such as parts of vehicles, may be diagnosed to detect parts that may be damaged or in failure mode. The diagnosis may vary depending on the individual conducting the diagnosis, which may lead to inaccurate results. The diagnosis may not take into account previous diagnoses, which makes it difficult to determine whether the current diagnosis is correct. If equipment is diagnosed incorrectly as operating as desired, a failure of the equipment may result in the equipment (e.g., a locomotive) breaking down. Conversely, if a part is inspected and incorrectly diagnosed as operating undesirably, for example by being damaged, defective, or failed, unnecessary replacement of the part results in removal of the equipment from service and additional repair costs.

BRIEF DESCRIPTION

In accordance with one example or aspect, a method may include recording operation of equipment to create an audio file and extracting features from the audio file. The method may include inputting the extracted features into a machine learning model and determining with the machine learning model a score indicative of the operation of the equipment.

In accordance with one example or aspect, a system may include an audio sensor to record audio of operation of equipment and generate an audio file, and one or more processors. The one or more processors extract features from the audio file and input the extracted features into a machine learning model. The one or more processors determine with the machine learning model a score indicative of the operation of the equipment.

In accordance with one example or aspect, a method may include recording operation of a component using a recording device to create a file and extracting features from the file to create an extraction file. The method may include inputting the extraction file into a machine learning model and determining with the machine learning model a score indicative of the operation of the component based at least in part on the extraction file.

BRIEF DESCRIPTION OF THE DRAWINGS

The inventive subject matter may be understood from reading the following description of non-limiting embodiments, with reference to the attached drawings, wherein below:

FIG. 1 schematically illustrates a system and method for diagnosing equipment according to one embodiment;

FIG. 2 schematically illustrates a system and method for diagnosing equipment according to one embodiment;

FIG. 3 schematically illustrates a system and method for diagnosing equipment according to one embodiment;

FIG. 4 schematically illustrates a method according to one embodiment; and

FIG. 5 schematically illustrates a method according to one embodiment.

DETAILED DESCRIPTION

Embodiments of the subject matter described herein relate to a device that can diagnose equipment, such as equipment associated with vehicle systems, power-generation systems, and construction equipment. Failure to diagnose equipment that is operating in an undesired manner may result in a failure of the equipment that results in expensive repair along with the equipment being out of service while being repaired. Incorrectly diagnosing equipment that is operating as desired as operating in an undesired manner may result in the equipment being out of service and unnecessary repair.

Diagnosing equipment by sound currently relies on an operator who is familiar with the operation of the equipment being able to recognize undesired operation by listening to the equipment and making a judgment based on experience. Diagnosing the equipment in this manner may not detect undesired operation early enough to prevent failure of the equipment and may lead to inconsistent results based on the experience of various operators. The embodiments described herein provide a way to diagnose equipment based on audio data and image data generated from the audio data by using a machine learning model that is trained to distinguish desired operation from undesired operation and that continues to learn and improve its ability to distinguish desired from undesired operation.

While one or more embodiments are described in connection with a rail vehicle system, not all embodiments are limited to rail vehicle systems. Further, embodiments described herein extend to multiple types of vehicle systems. Suitable vehicle systems may include rail vehicles, automobiles, trucks (with or without trailers), buses, marine vessels, aircraft, mining vehicles, agricultural vehicles, and off-highway vehicles. Suitable vehicle systems described herein can be formed from a single vehicle. In other embodiments, the vehicle system may include multiple vehicles that move in a coordinated fashion. A suitable vehicle system may be a rail vehicle system that travels on tracks, or a vehicle system that travels on roads or paths. With respect to multi-vehicle systems, the vehicles can be mechanically coupled with each other (e.g., by couplers) or they may be virtually or logically coupled but not mechanically coupled. For example, vehicles may be communicatively but not mechanically coupled when the separate vehicles communicate with each other to coordinate movements of the vehicles with each other so that the vehicles travel together (e.g., as a convoy, platoon, swarm, fleet, and the like).

With regard to the equipment or component, suitable examples may include equipment that is subject to periodic diagnosis. In one embodiment, the component may be an engine or a component of the vehicle system. For example, the equipment may be a high-pressure fuel pump for an engine of a locomotive. In another example, the component may be an electrical motor. Rotating equipment, generally, is amenable to diagnosis using the inventive methods.

Referring to FIG. 1, a piece of equipment 10 may be diagnosed by an audio recording device 12. According to one embodiment, the audio recording device may be a mobile, handheld device. The audio recording device may be a smartphone or a tablet or a personal digital assistant (PDA) or a laptop computer. The audio recording device includes an audio capture device, e.g., a microphone or an accelerometer or a probe, that captures audio indicative of operation of the piece of equipment or a part of the piece of equipment that is to be diagnosed and stores the audio as a raw audio file 14. According to one example, the raw audio file may be a WAV format file. The WAV file may include un-containerized and uncompressed audio data. The audio recording device may be connected to a sensor or a probe or an external microphone, for example by a USB connection. The sensor or probe or microphone may be placed in proximity to and/or in contact with the equipment part to generate the raw audio file. The audio recording device may communicate the audio file to one or more processors, which may execute instructions stored in a memory to use a machine learning model to make determinations and evaluations regarding the equipment or a part of the equipment. For example, the determination may relate to whether the equipment or the part of the equipment is operating in a desired mode or an undesired mode. With regard to close proximity, the distance may be selected with reference to application specific parameters. In one embodiment, the microphone may be within a few inches of a portion of the equipment or the part of the equipment.
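
By way of a non-limiting illustration, loading a raw WAV file into memory for later processing might be sketched as follows. The function name `read_wav_mono`, the restriction to 16-bit PCM, and the choice to keep only the first channel are assumptions made for this sketch and are not taken from the disclosure.

```python
import struct
import wave


def read_wav_mono(path):
    """Return (sample_rate, samples) with samples scaled to [-1.0, 1.0).

    Assumption for this sketch: the file holds 16-bit PCM audio.
    Only the first channel of a multi-channel file is kept.
    """
    with wave.open(path, "rb") as wav:
        if wav.getsampwidth() != 2:
            raise ValueError("this sketch only handles 16-bit PCM")
        channels = wav.getnchannels()
        rate = wav.getframerate()
        raw = wav.readframes(wav.getnframes())
    # WAV PCM data is little-endian; unpack every 16-bit sample.
    values = struct.unpack("<%dh" % (len(raw) // 2), raw)
    # Interleaved channels: take every `channels`-th value (channel 0).
    return rate, [v / 32768.0 for v in values[::channels]]
```

The returned list of samples may then be passed to normalization and feature extraction steps such as those described below.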

Suitable audio files may include lossy and non-lossy file types. Examples of audio file types may include .wav, .mp3, .wma, .aac, .ogg, .midi, .aif, .aifc, .aiff, .au, and .ea. File type may be selected based at least in part on the compression ratio, compression algorithm, and other application specific parameters.

According to an example, the equipment is a high-pressure fuel pump of a vehicle. The raw audio file may be generated while the locomotive engine is running in an idle (i.e., unloaded) condition. The audio recording device may be placed in close proximity to the high-pressure fuel pump and the audio recording device or the audio capture device may be moved between different locations (e.g., from a first recording location 16, to a second recording location 18, to a third recording location 20, and so on). While the illustrated example shows recording in three locations, optionally, recordings may occur at fewer locations (e.g., a single location or two locations) or more than three locations. As shown in FIG. 1, the recording locations extend from the top to the bottom of the equipment. The audio recording device or audio capture device may be hovered over each of the recording locations for a period of time as the audio recording device or the audio capture device is moved from the first to the second to the third recording location. According to an example, the operation of the high-pressure fuel pump may be recorded for a period of time, for example 30 seconds, one minute, or another length of time. The audio recording device or audio capture device may be used to output two or more audio files. For example, the audio recording device or audio capture device may output a first audio file of a first fuel pump on a first side of a vehicle and may capture a second audio file of a second fuel pump on a second, opposite side of the vehicle.

Referring to FIG. 2, a method 24 for diagnosing equipment according to one embodiment includes extracting 26 features from the raw audio file. The features that may be extracted include, but are not limited to, one or more of a zero cross rate, a spectral centroid, a spectral bandwidth, a root mean square error (RMSE), a chroma STFT (short time Fourier transform), or a spectral roll off. The extracted features are input into an input layer 30 of a machine learning model 28. The one or more processors of the audio recording device may transform or convert the raw audio file to image data 22, e.g., a mel spectrogram. The audio data of the audio file may be transformed into the image data of the mel spectrogram using a Fast Fourier Transform (FFT) with, for example, a window function having a determined window size. The analysis may use a determined hop size to sample the audio file a determined number of times in between successive windows. The FFT for each window may be computed to transform from the time domain to the frequency domain. The mel scale may be generated by separating the entire frequency spectrum into a determined number of evenly spaced frequencies. The spectrogram may then be generated by, for each window, decomposing the magnitude of the signal into its components, the components corresponding to the frequencies in the mel scale. According to one example, the mel spectrogram may be a 23 feature set.
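
By way of a non-limiting illustration, the windowing, hop, transform, and mel-scale steps described above might be sketched as follows. This sketch makes several assumptions that are not taken from the disclosure: it computes a direct discrete Fourier transform instead of an FFT so that it needs no external library, it uses a Hann window, it uses a common formula for converting between hertz and mels, and it samples the nearest frequency bin for each mel frequency rather than applying a full mel filter bank. The function names and the default window and hop sizes are hypothetical; the default of 23 bands follows the 23 feature set mentioned above.

```python
import cmath
import math


def hz_to_mel(hz):
    # A commonly used hertz-to-mel conversion (assumed for this sketch).
    return 2595.0 * math.log10(1.0 + hz / 700.0)


def mel_to_hz(mel):
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_center_frequencies(n_bands, max_hz):
    """Frequencies in Hz that are evenly spaced on the mel scale."""
    top = hz_to_mel(max_hz)
    return [mel_to_hz(top * i / (n_bands - 1)) for i in range(n_bands)]


def frame_magnitudes(frame):
    """Magnitudes of DFT bins 0..N/2 of a Hann-windowed frame."""
    n = len(frame)
    windowed = [
        x * (0.5 - 0.5 * math.cos(2.0 * math.pi * i / n))
        for i, x in enumerate(frame)
    ]
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * i / n)
                for i, x in enumerate(windowed)))
        for k in range(n // 2 + 1)
    ]


def mel_spectrogram(samples, rate, window_size=256, hop_size=128, n_bands=23):
    """Return one row of n_bands magnitudes per analysis window."""
    centers = mel_center_frequencies(n_bands, rate / 2.0)
    rows = []
    for start in range(0, len(samples) - window_size + 1, hop_size):
        mags = frame_magnitudes(samples[start:start + window_size])
        # Nearest-bin lookup: a simplification of a mel filter bank.
        rows.append([
            mags[min(len(mags) - 1, round(f * window_size / rate))]
            for f in centers
        ])
    return rows
```

Each row of the result corresponds to one window of the audio file, and the rows taken together form a two-dimensional array that may be treated as image data.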

The mel spectrogram may be input into the input layer with the extracted features. According to an example, the machine learning model includes the input layer, a hidden layer 32, and an output layer 36. As shown in FIG. 2, according to an example, the hidden layer is a single layer and the output layer is a single output neuron. According to other examples, the hidden layer may include multiple hidden layers and/or the output layer may include multiple neurons. In one embodiment, the audio file may be visualized as a wave function. A Fourier Transform may be used for extraction of information from the audio file, the video file, or both. In other embodiments, other transform algorithms may be employed. Suitable transformation models may include Laplace transforms, Wavelet transforms, and Kramers-Kronig transforms.

The algorithm of the machine learning model applies an input bias 34 to the inputs of the hidden layer and directs them through an activation function as the output. According to one embodiment, the algorithm applies an output bias 38 to the output which is provided as an anomaly measure indicative of the operation of the equipment. In one embodiment, the machine learning model is a supervised machine learning model. The machine learning model may be provided with training data that is labeled. The training data may include audio files and image data of the operation of equipment operating in a desired mode and audio files and image data of the equipment operating in an undesired mode. The training data may be from similar equipment, for example from other high-pressure fuel pumps. The training data may be from one or more previous diagnoses of the same equipment. For example, the machine learning model may include previous data of a high-pressure fuel pump and determine that the high-pressure fuel pump has been diagnosed a previous number of times. The machine learning model may include the previous data of the previous diagnoses. The machine learning model may determine that a piece of equipment or a part of a piece of equipment that has been previously diagnosed a certain number of times and determined to be operating in a desired mode may be more likely to be operating in an undesired mode. The machine learning model may also determine from the input data that the equipment being diagnosed is older than other equipment that has been diagnosed and thus determine a degradation of the equipment part over time.
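
By way of a non-limiting illustration, a forward pass through a model having a single hidden layer and a single output neuron, with an input bias applied before the activation function and an output bias applied to the output, might be sketched as follows. The sigmoid activation, the function names, and the parameter layout are assumptions made for this sketch; the disclosure does not specify a particular activation function.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def forward(features, hidden_weights, input_bias, output_weights, output_bias):
    """Return a single anomaly measure for one feature vector.

    hidden_weights holds one row of weights per hidden neuron;
    input_bias holds one bias per hidden neuron.
    """
    # Hidden layer: weighted sum plus input bias, then activation.
    hidden = [
        sigmoid(sum(w * x for w, x in zip(row, features)) + b)
        for row, b in zip(hidden_weights, input_bias)
    ]
    # Output neuron: weighted sum of hidden activations plus output bias.
    return sum(w * h for w, h in zip(output_weights, hidden)) + output_bias
```

In such a sketch, the weights and biases would be set by training on the labeled data described above.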

According to an example, the machine learning model may reference results of the model concurrently with the recording of the operation of the equipment to provide more accurate decision making. Referring again to FIG. 1, as the audio recording device is moved from one recording location to another recording location, the results at one or more previous recording locations may be used at the next recording location as a concurrent reference point. As the audio recording device or the audio capture device is moved from, for example, cylinder to cylinder in an engine or from cylinder to cylinder in a pump or from pump to pump in the case of multiple pumps, the algorithm of the machine learning model may reference the prior equipment part behaviors and assessments and may adjust the thresholds concurrently with the recording specific to the equipment being diagnosed. The machine learning model may adjust prior assessments of equipment and equipment parts after completion of the evaluation of the entire equipment.

The machine learning model may be stored in a memory of the audio recording device and executed by the one or more processors. The memory of the audio recording device may store the training data and the input data of previous diagnoses, either from diagnoses previously performed by the audio recording device or from other audio recording devices. The input data for the machine learning model is labeled and structured, and through operation of the hidden layer the machine learning model detects patterns in the input data and detects any anomaly in the patterns. According to one embodiment, the output 40 of the machine learning model is a single value score indicative of the operation of the equipment on a spectrum of single value scores.

Referring to FIG. 3, a process 42 for diagnosing equipment according to an example includes determining a standard score. The raw audio file is processed into a normalized audio file 44 and features are extracted from the normalized audio file. According to an example, the features that are extracted correspond to the features described with respect to FIG. 2. According to an example, the extracted audio features are provided in a feature set in a comma-separated values format.
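
By way of a non-limiting illustration, normalizing the audio data, extracting two simple features, and writing them as a comma-separated feature set might be sketched as follows. Peak normalization is an assumption, since the disclosure does not state which normalization is used. The root mean square value of the signal is used here as a stand-in for the RMSE feature named above, which is also an assumption. The function and column names are hypothetical.

```python
import csv
import io
import math


def peak_normalize(samples):
    """Scale samples so the largest absolute value is 1.0."""
    peak = max((abs(s) for s in samples), default=0.0)
    return list(samples) if peak == 0.0 else [s / peak for s in samples]


def zero_cross_rate(samples):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(
        1 for a, b in zip(samples, samples[1:]) if (a < 0) != (b < 0)
    )
    return crossings / max(1, len(samples) - 1)


def root_mean_square(samples):
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def features_to_csv(samples):
    """Return a CSV string with a header row and one row of features."""
    norm = peak_normalize(samples)
    row = {
        "zero_cross_rate": zero_cross_rate(norm),
        "rms": root_mean_square(norm),
    }
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=list(row), lineterminator="\n")
    writer.writeheader()
    writer.writerow(row)
    return out.getvalue()
```

Additional features, such as a spectral centroid or a spectral roll off, could be added as further columns of the same row.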

The extracted features are input into a machine learning model 46. According to one embodiment, the machine learning model corresponds to the machine learning model described with respect to FIG. 2. The output of the machine learning model is entered into a loss function 48. The loss function determines a maximum likelihood of normal data. The loss function has a scoring component 50 that assigns scores to the data based on a Gaussian distribution.

The scores that are assigned by the loss function are standard scores. Normal data will be assigned standard scores that fall within a range. According to one embodiment, the range is from −3 to 3. A standard score falling outside of the range is indicative of anomalous data. A standard score with a first range of values may be indicative of operation of the equipment in a desired mode and a standard score with a second range of values may be indicative of operation of the equipment in an undesired mode. According to one embodiment, a standard score within the second range may be higher than a standard score within the first range. A derivative of an error determined by the loss function 52 is determined and provided to the machine learning model. The machine learning model learns to fit the data and predict a score which assigns minimum value to normal data and statistically higher value for anomalous data.
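
By way of a non-limiting illustration, converting a model output into a standard score relative to outputs observed for normal data, and testing it against the range of −3 to 3, might be sketched as follows. Using the population mean and standard deviation of a set of reference outputs is an assumption made for this sketch, as are the function names.

```python
import statistics


def standard_score(value, reference):
    """Standard score of value relative to reference model outputs."""
    mean = statistics.fmean(reference)
    stdev = statistics.pstdev(reference)
    if stdev == 0.0:
        raise ValueError("reference outputs have zero spread")
    return (value - mean) / stdev


def is_anomalous(score, limit=3.0):
    """True when the standard score falls outside [-limit, limit]."""
    return abs(score) > limit
```

For example, with reference outputs having a mean of 5 and a standard deviation of 2, an output of 9 has a standard score of 2 and falls within the range, while an output of 12 has a standard score of 3.5 and falls outside it.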

Referring to FIG. 4, a method 400 may include a step 410 of recording operation of equipment to create an audio file and a step 420 of extracting features from the audio file. The method includes a step 430 of inputting the extracted features into a machine learning model and a step 440 of determining with the machine learning model a score indicative of the operation of the equipment.

Referring to FIG. 5, a method 500 may include a step 510 of recording operation of a component using a recording device to create a file and a step 520 of extracting features from the file to create an extraction file. The method may include a step 530 of inputting the extraction file into a machine learning model and a step 540 of determining with the machine learning model a score indicative of the operation of the component based at least in part on the extraction file.

A method may include recording operation of equipment to create an audio file and extracting features from the audio file. The method may include inputting the extracted features into a machine learning model and determining with the machine learning model a score indicative of the operation of the equipment.

The method may include generating at least one of a recommendation for modifying operation of the equipment, a request for maintenance of the equipment, and a scheduled future date for conducting a subsequent recording with a notice that the score of the equipment falls within determined acceptable operational parameters.

The machine learning model may include training data including first audio files of desired modes of operation of equipment and second audio files of undesired modes of operation of equipment. The method may include adding the audio file to the training data.

The method may include normalizing data of the audio file prior to extracting the features.

The method may include entering the score into a loss function, and the loss function is based on a Gaussian distribution of the features that are extracted and input into the machine learning model.

The score may be a standard score. A standard score within a first range may indicate a desired mode of operation of the equipment and a standard score within a second range may indicate an undesired mode of operation of the equipment.

The features that are extracted from the audio file may include one or more of a zero cross rate, a spectral centroid, a root mean square error, and a spectral roll off.

The method may include generating image data corresponding to the audio file, inputting the generated image data into the machine learning model concurrently with the audio recording operation, and determining the score based at least in part on the generated image data.

A system may include an audio sensor to record audio of operation of equipment and generate an audio file, and one or more processors. The one or more processors extract features from the audio file, input the extracted features into a machine learning model, and determine with the machine learning model a score indicative of the operation of the equipment.

The one or more processors, using the score, may generate at least one of a recommendation for modifying operation of the equipment, a request for maintenance of the equipment, and a scheduled future date for conducting a subsequent recording with a notice that the score of the equipment falls within determined acceptable operational parameters.

The machine learning model may include training data including first audio files of desired modes of operation of equipment and second audio files of undesired modes of operation of equipment. The one or more processors may add the audio file to the training data.

The one or more processors may normalize data of the audio file prior to extracting the features.

The one or more processors may enter the score into a loss function. The loss function may be based on a Gaussian distribution of the features that are extracted and input into the machine learning model.

The score may be a standard score. A standard score within a first range may indicate a desired mode of operation of the equipment and a standard score within a second range may indicate an undesired mode of operation of the equipment.

The features that are extracted from the audio file may include one or more of a zero cross rate, a spectral centroid, a root mean square error, and a spectral roll off.

The extracted features may be video features. The one or more processors may generate image data based at least in part on the extracted features of the audio file, input the generated image data into the machine learning model, and determine the score based at least in part on the generated image data.

Both the audio file and the image data may be concurrently input into the machine learning model.

The component may be at least one of a fuel pump, a water pump, a crankshaft, and a piston.

A method may include recording operation of a component using a recording device to create a file and extracting features from the file to create an extraction file. The method may include inputting the extraction file into a machine learning model and determining with the machine learning model a score indicative of the operation of the component based at least in part on the extraction file.

The extraction files may include audio files, video files, or both audio and video files and the machine learning model may include training data including first reference files of desired modes of operation of the component and second reference files of undesired modes of operation of the component. The method may include adding the extraction file to the training data as either a desired mode of operation or an undesired mode of operation.

In one embodiment, the processors may determine more graduated data about the component. That is, the processors may determine not only whether the component is operating in a desired or undesired state but also the degree to which it is operating in such state. The score may be on a graduated scale, and it may correspond to expected remaining useful life of the component. That information, then, may be used to schedule maintenance, repair or replacement at a future date that is prior to a calculated failure date. The calculated failure date may have margins of error. The margin of error may be determined, in one example, based on the criticality of the component and the impact of its failure. In one embodiment, that information may be used to modify operation of the component. For example, if the component is used in less stressful duty cycles it may last longer than if it is used to maximum capability.

In one embodiment, the one or more processors or systems described herein may have a local data collection system deployed and may use machine learning to enable derivation-based learning outcomes. The one or more processors may learn from and make decisions on a set of data (including data provided by the various sensors), by making data-driven predictions and adapting according to the set of data. In embodiments, machine learning may involve performing a plurality of machine learning tasks by machine learning systems, such as supervised learning, unsupervised learning, and reinforcement learning. Supervised learning may include presenting a set of example inputs and desired outputs to the machine learning systems. Unsupervised learning may include the learning algorithm structuring its input by methods such as pattern detection and/or feature learning. Reinforcement learning may include the machine learning systems performing in a dynamic environment and then providing feedback about correct and incorrect decisions. In examples, machine learning may include a plurality of other tasks based on an output of the machine learning system. In examples, the tasks may be machine learning problems such as classification, regression, clustering, density estimation, dimensionality reduction, anomaly detection, and the like. In examples, machine learning may include a plurality of mathematical and statistical techniques. In examples, the many types of machine learning algorithms may include decision tree based learning, association rule learning, deep learning, artificial neural networks, genetic learning algorithms, inductive logic programming, support vector machines (SVMs), Bayesian network, reinforcement learning, representation learning, rule-based machine learning, sparse dictionary learning, similarity and metric learning, learning classifier systems (LCS), logistic regression, random forest, K-Means, gradient boost, K-nearest neighbors (KNN), a priori algorithms, and the like. In embodiments, certain machine learning algorithms may be used (e.g., for solving both constrained and unconstrained optimization problems that may be based on natural selection). In an example, the algorithm may be used to address problems of mixed integer programming, where some components are restricted to being integer-valued. Algorithms and machine learning techniques and systems may be used in computational intelligence systems, computer vision, Natural Language Processing (NLP), recommender systems, reinforcement learning, building graphical models, and the like. In an example, machine learning may be used for making determinations, calculations, comparisons and behavior analytics, and the like.

In one embodiment, the one or more processors may include a policy engine that may apply one or more policies. These policies may be based at least in part on characteristics of a given item of equipment or environment. With respect to control policies, a neural network can receive input of a number of environmental and task-related parameters. These parameters may include, for example, operational input regarding operating equipment, data from various sensors, location and/or position data, and the like. The neural network can be trained to generate an output based on these inputs, with the output representing an action or sequence of actions that the equipment or system should take to accomplish the goal of the operation. During operation of one embodiment, a determination can occur by processing the inputs through the parameters of the neural network to generate a value at the output node designating that action as the desired action. This action may translate into a signal that causes the vehicle to operate. This may be accomplished via backpropagation, feed forward processes, closed loop feedback, or open loop feedback. Alternatively, rather than using backpropagation, the machine learning system of the controller may use evolution strategies techniques to tune various parameters of the artificial neural network. The controller may use neural network architectures with functions that may not always be solvable using backpropagation, for example functions that are non-convex. In one embodiment, the neural network has a set of parameters representing weights of its node connections. A number of copies of this network are generated and then different adjustments to the parameters are made, and simulations are done. Once the outputs from the various models are obtained, they may be evaluated on their performance using a determined success metric. The best model is selected, and the vehicle controller executes that plan to achieve the desired input data to mirror the predicted best outcome scenario. Additionally, the success metric may be a combination of the optimized outcomes, which may be weighed relative to each other.
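
By way of a non-limiting illustration, the copy, adjust, evaluate, and select loop described above might be sketched as follows. This is a simple perturbation-based search and is only one possible reading of the evolution strategies techniques mentioned. The function name `evolve`, the Gaussian perturbations, and the default settings are assumptions made for this sketch.

```python
import random


def evolve(params, success_metric, generations=50, copies=20, sigma=0.1, seed=0):
    """Tune a list of parameters without backpropagation.

    success_metric is a caller-supplied function that maps a parameter
    list to a number, where a higher number is better.
    """
    rng = random.Random(seed)
    best = list(params)
    best_score = success_metric(best)
    for _ in range(generations):
        # Generate copies of the current best parameters, each with
        # a different random adjustment.
        candidates = [
            [p + rng.gauss(0.0, sigma) for p in best] for _ in range(copies)
        ]
        # Evaluate every copy and keep the best one if it improves.
        top = max(candidates, key=success_metric)
        top_score = success_metric(top)
        if top_score > best_score:
            best, best_score = top, top_score
    return best, best_score
```

Because a candidate replaces the current parameters only when it scores higher, the returned score is never lower than the score of the starting parameters.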

As used herein, the terms “processor” and “computer,” and related terms, e.g., “processing device,” “computing device,” and “controller” may not be limited to just those integrated circuits referred to in the art as a computer, but refer to a microcontroller, a microcomputer, a programmable logic controller (PLC), a field programmable gate array, an application specific integrated circuit, and other programmable circuits. Suitable memory may include, for example, a computer-readable medium. A computer-readable medium may be, for example, a random-access memory (RAM) or a computer-readable non-volatile medium, such as a flash memory. The term “non-transitory computer-readable media” represents a tangible computer-based device implemented for short-term and long-term storage of information, such as computer-readable instructions, data structures, program modules and sub-modules, or other data in any device. Therefore, the methods described herein may be encoded as executable instructions embodied in a tangible, non-transitory, computer-readable medium, including, without limitation, a storage device and/or a memory device. Such instructions, when executed by a processor, cause the processor to perform at least a portion of the methods described herein. As such, the term includes tangible, computer-readable media, including, without limitation, non-transitory computer storage devices, including without limitation, volatile and non-volatile media, and removable and non-removable media such as firmware, physical and virtual storage, CD-ROMS, DVDs, and other digital sources, such as a network or the Internet.

Where any or all of the terms “comprise”, “comprises”, “comprised” or “comprising” are used in this specification (including the claims) they are to be interpreted as specifying the presence of the stated features, integers, steps or components, but not precluding the presence of one or more other features, integers, steps or components.

The singular forms “a”, “an”, and “the” include plural references unless the context clearly dictates otherwise. “Optional” or “optionally” means that the subsequently described event or circumstance may or may not occur, and that the description may include instances where the event occurs and instances where it does not. Approximating language, as used herein throughout the specification and clauses, may be applied to modify any quantitative representation that could permissibly vary without resulting in a change in the basic function to which it may be related. Accordingly, a value modified by a term or terms, such as “about,” “substantially,” and “approximately,” may not be limited to the precise value specified. In at least some instances, the approximating language may correspond to the precision of an instrument for measuring the value. Here and throughout the specification and clauses, range limitations may be combined and/or interchanged, and such ranges may be identified and include all the sub-ranges contained therein unless context or language indicates otherwise.

This written description uses examples to disclose the embodiments, including the best mode, and to enable a person of ordinary skill in the art to practice the embodiments, including making and using any devices or systems and performing any incorporated methods. The claims define the patentable scope of the disclosure, and include other examples that occur to those of ordinary skill in the art. Such other examples are intended to be within the scope of the claims if they have structural elements that do not differ from the literal language of the claims, or if they include equivalent structural elements with insubstantial differences from the literal language of the claims.

What is claimed is:
 1. A method, comprising: recording operation of equipment to create an audio file; extracting features from the audio file; inputting the extracted features into a machine learning model; and determining with the machine learning model a score indicative of the operation of the equipment.
 2. The method of claim 1, further comprising generating at least one of a recommendation for modifying operation of the equipment, a request for maintenance of the equipment, and a scheduled future date for conducting a subsequent recording with a notice that the score of the equipment falls within determined acceptable operational parameters.
 3. The method of claim 1, wherein the machine learning model comprises training data including first audio files of desired modes of operation of equipment and second audio files of undesired modes of operation of equipment, the method further comprising: adding the audio file to the training data.
 4. The method of claim 1, further comprising normalizing data of the audio file prior to extracting the features.
 5. The method of claim 1, further comprising: entering the score into a loss function, wherein the loss function is based on a Gaussian distribution of the features that are extracted and input into the machine learning model.
 6. The method of claim 1, wherein the score is a standard score, and a standard score within a first range is indicative of a desired mode of operation of the equipment and a standard score within a second range is indicative of an undesired mode of operation of the equipment.
 7. The method of claim 1, wherein the features that are extracted from the audio file comprise one or more of a zero cross rate, a spectral centroid, a root mean square error, and a spectral roll off.
 8. The method of claim 1, further comprising: generating image data corresponding to the audio file; inputting the generated image data into the machine learning model concurrently with the recording audio operation; and determining the score based at least in part on the generated image data.
 9. A system, comprising: an audio sensor configured to record audio of operation of equipment and thereby to generate an audio file; and one or more processors configured to: extract features from the audio file; input the extracted features into a machine learning model; and determine with the machine learning model a score indicative of the operation of the equipment.
 10. The system of claim 9, wherein the one or more processors, using the score, generate at least one of a recommendation for modifying operation of the equipment, a request for maintenance of the equipment, and a scheduled future date for conducting a subsequent recording with a notice that the score of the equipment falls within determined acceptable operational parameters.
 11. The system of claim 9, wherein the machine learning model comprises training data including first audio files of desired modes of operation of equipment and second audio files of undesired modes of operation of equipment, the one or more processors being further configured to: add the audio file to the training data.
 12. The system of claim 9, wherein the one or more processors are further configured to normalize data of the audio file prior to extracting the features.
 13. The system of claim 9, wherein the one or more processors are further configured to: enter the score into a loss function, wherein the loss function is based on a Gaussian distribution of the features that are extracted and input into the machine learning model.
 14. The system of claim 9, wherein the score is a standard score, and a standard score within a first range is indicative of a desired mode of operation of the equipment and a standard score within a second range is indicative of an undesired mode of operation of the equipment.
 15. The system of claim 9, wherein the features that are extracted from the audio file comprise one or more of a zero cross rate, a spectral centroid, a root mean square error, and a spectral roll off.
 16. The system of claim 9, wherein the extracted features are video features and the one or more processors are further configured to: generate image data based at least in part on the extracted features of the audio file; input the generated image data into the machine learning model; and determine the score based at least in part on the generated image data.
 17. The system of claim 16, wherein both the audio file and the image data are concurrently inputted into the machine learning model.
 18. The system of claim 9, wherein the component is at least one of a fuel pump, a water pump, a crank shaft, and a piston.
 19. A method, comprising: recording operation of a component using a recording device to create a file; extracting features from the file to create an extraction file; inputting the extraction file into a machine learning model; and determining with the machine learning model a score indicative of the operation of the component based at least in part on the extraction file.
 20. The method of claim 19, wherein the extraction files include audio files, video files, or both audio and video files and the machine learning model comprises training data including first reference files of desired modes of operation of the component and second reference files of undesired modes of operation of the component, the method further comprising adding the extraction file to the training data as either a desired mode of operation or an undesired mode of operation.
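
By way of non-limiting illustration only, the audio features named in claims 7 and 15 and the standard score of claims 6 and 14 might be computed along the following lines. This is a minimal sketch and not the claimed implementation. All function names, the 0.85 roll-off fraction, the use of a naive discrete Fourier transform on a single short frame, and the reference values are assumptions made for the example. The sketch also computes a root mean square amplitude as a stand-in for the recited root mean square feature, and it scores one feature against reference statistics rather than using a trained machine learning model.

```python
import cmath
import math
from statistics import mean, pstdev


def zero_crossing_rate(x):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(1 for a, b in zip(x, x[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(x) - 1)


def rms(x):
    """Root mean square amplitude of the frame (assumed stand-in feature)."""
    return math.sqrt(sum(v * v for v in x) / len(x))


def magnitude_spectrum(x, sample_rate):
    """Naive DFT over bins 0..n/2; returns (frequencies in Hz, magnitudes)."""
    n = len(x)
    freqs, mags = [], []
    for k in range(n // 2 + 1):
        acc = sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        freqs.append(k * sample_rate / n)
        mags.append(abs(acc))
    return freqs, mags


def spectral_centroid(freqs, mags):
    """Magnitude-weighted mean frequency."""
    total = sum(mags)
    return sum(f * m for f, m in zip(freqs, mags)) / total if total else 0.0


def spectral_rolloff(freqs, mags, fraction=0.85):
    """Lowest bin frequency at which cumulative magnitude reaches `fraction`."""
    threshold = fraction * sum(mags)
    running = 0.0
    for f, m in zip(freqs, mags):
        running += m
        if running >= threshold:
            return f
    return freqs[-1]


def extract_features(x, sample_rate):
    """Collect the four illustrative features for one frame of samples."""
    freqs, mags = magnitude_spectrum(x, sample_rate)
    return {
        "zcr": zero_crossing_rate(x),
        "centroid": spectral_centroid(freqs, mags),
        "rms": rms(x),
        "rolloff": spectral_rolloff(freqs, mags),
    }


def standard_score(value, reference_values):
    """Z-score of `value` relative to reference (e.g., desired-mode) values."""
    sigma = pstdev(reference_values)
    if sigma == 0:
        raise ValueError("reference values have zero spread")
    return (value - mean(reference_values)) / sigma


# Hypothetical usage: a synthetic 1 kHz tone stands in for a recorded frame.
sample_rate = 8000
frame = [math.sin(2 * math.pi * 1000 * t / sample_rate + 0.5) for t in range(256)]
features = extract_features(frame, sample_rate)
# Reference centroids below are made-up values, not measured data.
score = standard_score(features["centroid"], [900.0, 1100.0])
```

In such a sketch, a standard score near zero would fall in a first range associated with a desired mode of operation, while a score of large magnitude would fall in a second range associated with an undesired mode. The boundaries of those ranges are not specified here and would have to be chosen for the equipment in question.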